Skip to content

Conversation

@martinduke
Copy link
Contributor

@martinduke martinduke commented Oct 24, 2025

Fixes the request error part of #1175.

Fixes: #1231

Copy link
Collaborator

@afrind afrind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this proposal.

This also fixes #1256.

@afrind afrind added Wire Format Related to how messages are serialized and parsed Design Issues or PRs that change how MoQ works including the wire format. labels Oct 24, 2025
Copy link
Collaborator

@ianswett ianswett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks sensible, but I'd prefer to not split error codes and make the retry-after always optional

Length (16),
Request ID (i),
Error Code (i),
[Retry Interval (i)],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of framing, I think I'd prefer to not split error codes and instead make the Retry Interval 'optional' by reserving 0 as a special value that's equivalent to it not being present.

This also avoids us having a discussion about which of the error codes should be retryable.

Comment on lines 2237 to 2239
If a Retry Interval exceeds the lifetime of a necessary authentication token
used in the request, so that a retry at that time would fail, the sender SHOULD
use an error code indicating a new authentication token is needed.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really mean this? Like if I get a DOES_NOT_EXIST with a valid auth token, and Retry-After 30 (since it will exist then), but the token expires in 15, I change the error to UNAUTHORIZED?

That seems super confusing. I think I'd rather get the real error code, and burn the round trip when my token actually expires.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am inclined to remove this entirely.

@afrind afrind changed the title Error codes Add Retry Interval to REQUEST_ERROR Nov 8, 2025
Copy link
Collaborator

@ianswett ianswett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small comments, but this looks close.

Copy link
Collaborator

@ianswett ianswett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM

Copy link
Collaborator

@fluffy fluffy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should split out the stuff on code point allocation and put in separate PR that forms the IANA consideration section.

| INVALID_JOINING_REQUEST_ID | 0x32 | {{message-request-error}} |
| UNKNOWN_STATUS_IN_RANGE | 0x33 | {{message-request-error}} |

The range of error codes including 0x100 to 0xffff is reserved for
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not keen on implementation specific error code. I think that contributes to non interoperable implementations. I'd like to have some discussion on this.

The range of error codes including 0x100 to 0xffff is reserved for
implementation-speceific codes.

The range of error codes starting with 0x10000 is reserved for
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we right this up in the normal way we would IANA. This should be there is a way to get IANA to pre allocate a code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any reason we need to invent new IANA processes beyond what is in rfc8126

sending more than one such message within a second or so across one or more
sessions, it SHOULD apply randomization to each retry interval so that retries
are spread out over time, minimizing the risk of synchronized retry storms. A
Retry Interval value of 1 indicates the request can be retried immediately.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it needs to do randomization even if it is not doing multiple requests. A common problem is Start of Service attacks. With IP phones we regularly see cities of over 10 million people loose power, then all restart at the same time often all with a set top box from of some sort from the same vendor. This forms a Start Of Service attack where the normal mitigations is for the server to be able to respond with retry after and it is more robust to spreading out the load if that can be spread.

My suggestion would be just make this very specific that the min time to respond was a number between the values returned and 125% of value returned with uniform random distribution in that time. Sure some clients won't bother but the ones that cause SoS attacks will.

* Error Code: Identifies an integer error code for request failure.

* Retry Interval: The minimum time (in seconds) before the request SHOULD be
sent again, plus one. If the value is 0, the request SHOULD NOT be retried.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as something to think about ... ff you use seconds instead of ms, it makes it hard for the server to pace multiple clients but I can see how either would work.

@afrind
Copy link
Collaborator

afrind commented Nov 16, 2025

I added this to the agenda for tomorrow's meeting (11/17) and in I would like to get a show of hands on whether we will or will not support implementation defined codes.

@afrind afrind added the Needs Discussion Tags for issues which need discussion, ideally at an interim or IETF meeting. label Nov 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Design Issues or PRs that change how MoQ works including the wire format. Needs Discussion Tags for issues which need discussion, ideally at an interim or IETF meeting. Wire Format Related to how messages are serialized and parsed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Are error codes best-effort or mandatory?

4 participants